Multi-Speaker Localization, Separation and Resynthesis for Next Generation Videoconferencing
Abstract
Videoconferencing systems have been on the market for a long time. Their aim is to let people hold meetings without the participants being physically present. However, the sense of realism these systems achieve usually falls well short of what the people involved in the communication expect. In this paper, we present several advances in audio signal processing related to the capture, processing and reproduction of participants in a meeting environment. These novel approaches can be integrated into videoconferencing systems to make the sense of being there as real as possible. This paper is intended as a brief summary of the capabilities of the iTEAM research institute for solving, from both a technical and a practical perspective, the technological challenges that high-immersion videoconferencing will bring in the near future.
Similar resources
Design and Analysis of Multi-Source Extraction and Localization for Sound-Activated Designation of Automata
In this paper, we use speech signals to design a sound-activated designation scheme that confines an automaton to a specific physical space. This work presents the design and analysis of multi-source extraction and localization for a ubiquitous sound-activation system. To extract a speaker's sound from interference sources (such as babble noise from a TV), a multi-source...
Slope at Zero Crossings (zc) of Speech Signal for Multi-speaker Activity Detection
Multi-speaker activity (MSA) detection determines whether a speech signal contains a single speaker or multiple speakers. The slope at the zero crossings (ZCs) of the speech signal is easy to calculate and is compared with a suitable threshold (Th). Multi-speaker activity is declared when the zero crossing value exceeds the threshold. The i...
A Mathematical Model for Multi-Region, Multi-Source, Multi-Period Generation Expansion Planning in Renewable Energy for Country-Wide Generation-Transmission Planning
Environmental pollution and rapid depletion are among the chief concerns about fossil fuels such as oil, gas, and coal. Renewable energy sources do not suffer from such limitations and are considered the best choice to replace fossil fuels. The present study develops a mathematical model for optimal allocation of regional renewable energy to meet a country-wide demand and its other essential as...
Separation of multiple concurrent speeches using audio-visual speaker localization and minimum variance beam-forming
Speaker segmentation is an important task in multi-party conversations. Overlapping speech poses a serious problem in segmenting audio into speaker turns. We propose an audio-visual speech separation system consisting of an array microphone with eight sensors and an omnidirectional color camera. Multiple concurrent speeches are segmented by fusing the two heterogeneous sensors. Each segmented s...
Adaptive beamforming and soft missing data decoding for robust speech recognition in reverberant environments
This paper presents a novel approach to combine microphone array processing and robust speech recognition for reverberant multi-speaker environments. Spatial cues are extracted from a microphone array and automatically clustered to estimate localization masks in the time-frequency domain. The localization masks are then used to blindly design adaptive filters in order to enhance the source sign...
Publication date: 2009